CSCI E-25¶

Working with Image Data¶

Steve Elston¶

Introduction¶

This lesson will familiarize you with the basic concepts of working with image data and some statistical properties of images. Some key points of this lesson are:

  1. Discrete pixel structure of digital images.
  2. Representation of color and gray scale images.
  3. Intensity distribution of image data.
  4. Equalizing intensity distributions and improving contrast.
  5. Resizing images.


Scikit-Image: In this lesson we will be using the scikit-image package. This package provides many commonly used computer vision algorithms through a consistent Python API. It follows the API conventions of Scikit-Learn, enabling the application of a rich library of machine learning algorithms. There is excellent documentation available for scikit-image. You may wish to start by reading through the User Guide. Examples of what you can do with scikit-image can be seen in the Gallery.

Installation: To run the code in this notebook you will need to install scikit-image. Follow the installation directions for your operating system.

Scikit-Image and Numpy: Like all Scikit packages, scikit-image is built on Numpy. If you are not very familiar with Numpy you can find tutorials here. A tutorial on using Numpy with scikit-image data objects can be found here.

Loading Data¶

To get started with this lesson, execute the code in the cell below to import the packages you will need.

In [1]:
import skimage as ski
print(ski.__version__)
0.20.0
In [2]:
# conda install scikit-image
In [61]:
import skimage 
from skimage import data
from skimage.filters.rank import equalize, threshold
import skimage.filters as skfilters
from skimage import exposure
from skimage.morphology import disk, square
from skimage.color import rgb2gray, rgb2ycbcr, ycbcr2rgb, rgb2xyz, xyz2rgb, rgb2yuv, yuv2rgb
from skimage.measure import block_reduce
from skimage.transform import resize
from skimage.filters import gaussian
import numpy as np
import numpy.random as nr
from PIL import Image
from scipy import signal
from sklearn.preprocessing import MinMaxScaler 
import math
import matplotlib.pyplot as plt
%matplotlib inline

Structure of a Color Image Object¶

The code in the cell below loads a color image of a human retina, prints the data types and dimensions of the image object, and displays the image. The image is displayed by matplotlib.pyplot.imshow. Execute the code and examine the result.

In [4]:
retina_image = data.retina()
print('The image object is ' + str(type(retina_image)))
print('The pixel values are of type ' + str(type(retina_image[0,0,0])))
print('Shape of image object = ' + str(retina_image.shape))
fig, ax = plt.subplots( figsize=(6, 6))
_=ax.imshow(retina_image)
The image object is <class 'numpy.ndarray'>
The pixel values are of type <class 'numpy.uint8'>
Shape of image object = (1411, 1411, 3)

The image object has 3 dimensions: the two spatial dimensions and the 3 color channels. Examine this image, noticing the wide variation in color and intensity. Notice also that the illumination of the retina does not appear uniform, resulting in a bright spot on the left and a darker region on the right.
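The channel layout can be sketched on a tiny synthetic array; this is a minimal illustration of slicing the channels of an RGB ndarray, not part of the lesson's data:

```python
import numpy as np

# A tiny stand-in for an RGB image: 2 x 2 pixels, 3 channels of uint8
tiny = np.zeros((2, 2, 3), dtype=np.uint8)
tiny[:, :, 0] = 200        # fill the red channel
tiny[0, 0, 2] = 50         # one blue pixel value

red = tiny[:, :, 0]        # slicing the last axis gives a 2-D channel
print(red.shape)           # (2, 2)
print(tiny[0, 0])          # [200   0  50] -- the RGB triple of one pixel
```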

Exercise 1-1: Complete the code for the function in the cell below to display the 3 color channels of the image and the original image in a 2x2 array using the matplotlib.pyplot.imshow function. The color channels are in red, green, blue order and should be displayed as gray scale using the cmap=plt.get_cmap('gray') argument. Your function should label the channels and the original image. Execute your function and examine the results.

In [5]:
def plot_3_color_channels(img):
    '''Function plots the three color channels of the image along with the complete image'''
    fig, ax = plt.subplots(2, 2, figsize=(12, 12))
    ax = ax.flatten()

    # Display each color channel
    for i, channel_name in enumerate(['Red Channel', 'Green Channel', 'Blue Channel']):
        channel_img = img[:, :, i]
        ax[i].imshow(channel_img, cmap=plt.get_cmap('gray'))
        ax[i].set_title(channel_name)

    # Display the original image
    ax[-1].imshow(img)
    ax[-1].set_title('Original Image')

    plt.show()

# Call the function with the retina image
plot_3_color_channels(retina_image)
In [6]:
# Calculate the average intensity of each color channel
average_intensity = [np.mean(retina_image[:,:,i]) for i in range(3)]

# Find the index of the channel with the highest intensity
max_intensity_channel_index = np.argmax(average_intensity)
print("Channel with the greatest intensity:", ['Red', 'Green', 'Blue'][max_intensity_channel_index])
Channel with the greatest intensity: Red

Examine the intensity (brightness) of the color channels and answer these questions:

  1. Which channel has the greatest intensity, and does this make sense given the image?
  2. Is it likely that the saturation of the red channel arises as an artifact of the illumination spot on the left of the retina image?
    End of exercise.

Answers:

  1. The Red channel has the greatest intensity, as shown by the code above, which calculates the average intensity of each channel and finds the index of the channel with the highest average. This makes sense given that the original retina image is dominated by reddish tissue and blood vessels, leading to the red channel having the greatest intensity.
  2. Yes, it is likely that the saturation of the red channel arises as an artifact of the illumination spot on the left of the retina image. This can be due to factors such as light reflection (the spot reflects more red light), a color balance that emphasizes reddish tones, or the sensitivity of the sensor to red wavelengths.

When working with digital images it is always important to keep in mind the discrete nature of the samples. To demonstrate the discrete nature of a digital image you can visualize a 100 pixel, or $10 \times 10$, sample from the larger image by executing the code in the cell below.

In [7]:
plot_3_color_channels(retina_image[600:610,300:310,:])

Notice the discrete nature of each of the three color channels and of the color image. The combination of these discrete color-channel pixel intensities yields the color image.

Statistical Properties of an Image¶

The next question is, what is the distribution of pixel intensities in the 3 color channels of the image? Histograms and cumulative density functions are used to analyze these distributions. The code in the cell below plots the histograms of the 3 color channels along with their cumulative distributions. Execute this code and examine the results.

In [8]:
def plot_image_distributions(img, xlim=(0,255)):
    '''Function plots histograms of the three color channels of the image along 
    with the cumulative distributions'''
    fig, ax = plt.subplots(2,2, figsize=(12, 12))
    ax = ax.flatten()
    titles=['Red','Green','Blue']
    for i in range(3):
        ax[i].hist(img[:,:,i].flatten(), bins=50, density=True, color=titles[i], alpha=0.3)
        ax[i].set_title(titles[i])
        ax[i].set_xlabel('Pixel value')
        ax[i].set_ylabel('Density')
        ax[i].set_xlim(xlim)
        ax[3].hist(img[:,:,i].flatten(), bins=50, density=True, cumulative=True, color=titles[i], histtype='step', label=titles[i])
        ax[3].set_xlim(xlim)
    ax[3].set_title('Cumulative distributions')  
    ax[3].set_xlabel('Pixel value')
    ax[3].set_ylabel('Cumulative density')  
    plt.legend(loc='upper left')
    
plot_image_distributions(retina_image)    

There are several properties of the distribution of the pixel values which are important:

  1. The distribution of the intensity of the red channel has clearly higher values than the other channels.
  2. A significant fraction of pixel values have 0 intensity for all 3 color channels. These pixels are primarily black background around the retina, but may also represent the dark pupil spot in the center of the retina.
  3. A few red channel pixels have the maximum value of 255. The red intensity of these pixels is said to be saturated.
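Saturation can be checked directly by counting pixels at the representation's maximum value; a minimal sketch on a small illustrative uint8 channel (not the retina data):

```python
import numpy as np

channel = np.array([[0, 128, 255],
                    [255, 64, 12]], dtype=np.uint8)

# Pixels at the maximum representable value are saturated
n_saturated = int(np.sum(channel == 255))
print(n_saturated)   # 2
```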

Improving Contrast¶

Our next question is: what is the ideal distribution of the intensity values of an image? A useful, and perhaps obvious, answer is that we want the pixel values spread over the full range of possible values; for unsigned integers, $\{0, 255\}$. Further, the distribution of pixel values should be uniform. For the $n = 256$ unsigned integer values the probability mass function, or PMF, of the $i$th value is:

$$p(i) = \frac{1}{n}$$
  

And the cumulative distribution function, or CDF, of the uniform distribution at the $i$th value, $x_i$, is:

$$CDF(x_i) = \sum_{j=0}^{i} \frac{1}{n} = \frac{i+1}{n}$$
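The uniform PMF and its cumulative sum can be checked numerically; a quick sketch for $n = 256$:

```python
import numpy as np

n = 256
pmf = np.full(n, 1.0 / n)     # p(i) = 1/n for every value
cdf = np.cumsum(pmf)          # cumulative sum of the PMF

print(cdf[127])               # 0.5 -- halfway through the range
print(cdf[-1])                # 1.0 -- total probability
```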

We can visualize an example of a gray-scale image of unsigned integers on the range $\{0, 255\}$ with uniformly distributed random pixel values. The code in the cell below forms a gray-scale image of uniformly sampled random pixel values and displays the result.

In [9]:
random_image = nr.uniform(low=0.0, high=255.0, size=retina_image.shape[0]*retina_image.shape[1]).reshape((retina_image.shape[0], retina_image.shape[1]))
random_image = random_image.astype(np.uint8)
print(random_image.shape)
fig, ax = plt.subplots( figsize=(6, 6))
_=ax.imshow(random_image, cmap=plt.get_cmap('gray'))
(1411, 1411)

To view the distribution of the pixel values of this image execute the code in the cell below.

In [10]:
def plot_gray_scale_distribution(img):
    '''Function plots the histogram of a gray-scale image along 
    with the cumulative distribution'''
    fig, ax = plt.subplots(1,2, figsize=(12, 5))
    ax[0].hist(img.flatten(), bins=50, density=True, alpha=0.3)
    ax[0].set_title('Histogram of image')
    ax[0].set_xlabel('Pixel value')
    ax[0].set_ylabel('Density')
    ax[1].hist(img.flatten(), bins=50, density=True, cumulative=True, histtype='step')
    ax[1].set_title('Cumulative distribution of image')  
    ax[1].set_xlabel('Pixel value')
    ax[1].set_ylabel('Cumulative density') 
    plt.show()

plot_gray_scale_distribution(random_image)    

Exercise 1-2: To compare the pixel value distribution of the retina image to the ideal values do the following:

  1. Create a gray-scale image object named retina_gray_scale using the skimage.color.rgb2gray function.
  2. Print the dimensions of the image object.
  3. Display the gray-scale image. Make sure the image is large enough to see the details.
  4. Plot the distribution of the pixel values of the gray-scale image.
In [11]:
def plot_grayscale(img):
    fig, ax = plt.subplots(figsize=(6, 6))
    _ = ax.imshow(img, cmap=plt.get_cmap('gray'))
    plt.show()

# Create a gray-scale image object named retina_gray_scale
retina_gray_scale = skimage.color.rgb2gray(retina_image)

# Print the dimensions of the image object
print("Dimensions of the gray-scale image object:", retina_gray_scale.shape)

# Display the gray-scale image
plot_grayscale(retina_gray_scale)

# Plot the distribution of the pixel values of the gray-scale image
plot_gray_scale_distribution(retina_gray_scale)
Dimensions of the gray-scale image object: (1411, 1411)
In [12]:
retina_image.dtype
Out[12]:
dtype('uint8')

Examine the distribution plots and answer these questions:

  1. How would you describe these results with respect to the ideal distribution?
  2. How do you think the range of pixel values limit the contrast of the image?
    End of exercise.

Answer:

  1. The histogram of pixel values is skewed, which indicates an imbalance in pixel values and is not ideal. Moreover, in the cumulative distribution we observe sudden jumps and plateaus, indicating abrupt concentrations of pixel values, which is also not ideal.
  2. After conversion with rgb2gray the pixel values are floats on the range 0 to 1, but they cluster in a narrow portion of that range, so the dynamic range is not fully used and the contrast is reduced. This also reduces perceptual contrast, as the human eye is more sensitive to changes in mid-tones than at the extremes of the intensity scale, and a compressed value range makes contrast enhancement techniques harder to apply.

Histogram Equalization¶

Contrast of an image is the range between the minimum and maximum pixel values. The larger the range of values, the more distinctive the differences in the image will be. To improve the contrast of an image we need to spread the pixel values over the maximum possible range. The goal is to find a transformation that stretches the pixel values into a uniform distribution. This process is known as histogram equalization.
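The idea can be sketched in NumPy alone. The following is a minimal global equalization via the empirical CDF; the function name and test array are hypothetical, and this is a sketch of the principle, not the skimage implementation:

```python
import numpy as np

def equalize_uint8(img):
    # Build the empirical CDF of the pixel values...
    hist, _ = np.histogram(img.flatten(), bins=256, range=(0, 256))
    cdf = np.cumsum(hist).astype(np.float64)
    cdf /= cdf[-1]
    # ...and map each pixel through it, stretching toward uniform
    return (cdf[img] * 255).astype(np.uint8)

rng = np.random.default_rng(0)
skewed = (rng.beta(2.0, 5.0, size=(64, 64)) * 255).astype(np.uint8)
flat = equalize_uint8(skewed)
print(skewed.max(), flat.max())   # equalization stretches to the full range
```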

Histogram equalization can be performed in a number of ways. The obvious algorithm is global histogram equalization. The pixel values are transformed to equalize the histogram across the entire image. However, if illumination is inconsistent across the image, global equalization will not be optimal. An alternative is to perform local histogram equalization over small areas of the image. This method is known as adaptive histogram equalization. Adaptive equalization can compensate for uneven illumination across the image.

Exercise 1-3: You will now apply both common types of histogram equalization to the gray-scale retina image. Use both the skimage.exposure.equalize_hist function and the skimage.exposure.equalize_adapthist function. Create and execute a function which does the following:

  1. Executes the equalization function passed as an argument. Pass the function name as a string using the Python eval function.
  2. Display the equalized gray-scale image using the plot_grayscale() function.
  3. Plot the distribution of the pixel values of the equalized gray-scale image using the plot_gray_scale_distribution() function.

Execute your function for the two equalization algorithms. You can do this by iterating over a list of the function names. Print a line indicating which function is being executed. Save the results in a Numpy object named retina_gray_scale_equalized.

In [69]:
def test_equalize(img, func):
    print("Executing function:", func)
    img_equalized = eval(func)(img)
    plot_grayscale(img_equalized)
    plot_gray_scale_distribution(img_equalized)
    return img_equalized

# List of equalization function names
equalization_functions = ["skimage.exposure.equalize_hist", "skimage.exposure.equalize_adapthist"]

# Apply equalization functions and save the results
retina_gray_scale_equalized = []
for func_name in equalization_functions:
    result = test_equalize(retina_gray_scale, func_name)
    retina_gray_scale_equalized.append(result)

# Convert the list to a numpy array
retina_gray_scale_equalized = np.array(retina_gray_scale_equalized)
Executing function: skimage.exposure.equalize_hist
Executing function: skimage.exposure.equalize_adapthist

Answer the following questions:

  1. Compare the unequalized and equalized images. What aspects of the images are more apparent with the improved contrast?
  2. Compare the distributions of pixel values between the unequalized image, the random uniformly distributed image, and the two equalized images. Which of these histograms look the most similar, and what does this tell you about the contrast of the image?
  3. Does the difference in the distribution between the locally equalized image and the globally equalized image make sense and why?
    End of exercise.

Answers:

  1. Several aspects become better for the equalized image compared to the unequalized image such as: a) Enhanced contrast across the entire retina image b) Improved visibility of details especially in areas that were previously shadowy c) More balanced lighting leading to a more uniform appearance
  2. The unequalized image has a skewed histogram for distributions of pixel values, the random uniformly distributed image has a flat histogram with pixel values evenly distributed across the density scale, and the equalized image has a more balanced and uniform density distribution due to histogram equalization
  3. Yes the difference in distribution between locally equalized and globally equalized images makes sense. The local histogram equalization operates on small regions within the image rather than the entire image, computing separate histograms, equalization functions and local contrast for each patch. In contrast, the global histogram equalization considers the complete image and redistributes pixel values to enhance contrast.

Equalization for Multi-Channel Images¶

Contrast improvement, including histogram equalization, cannot be directly applied to the individual color channels of an RGB image. For RGB images the intensity of each color channel is mathematically unconstrained by the other two. However in reality, the brightness or intensity of each pixel depends on the value of all three channels. Therefore, independently applying 2-dimensional equalization to an RGB image causes normalization problems.

A common approach is to transform an RGB image into one of several possible formats that use a 2-dimensional color space map, or chromaticity map, of image intensity. There are a great many such choices, a number of which are supported in the skimage.color package.

As an example, the CIE 1931, or XYZ, color space is a map of luminance, Y, over a two-dimensional space of chromaticity. Z is quasi-equal to blue, and X is a mix of the RGB values. The XYZ color space is shown in the figure below.


XYZ Color Space
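For linear RGB values, the transformation to XYZ is a single matrix multiply; a sketch using the standard sRGB (D65) matrix (note that skimage's rgb2xyz additionally undoes the sRGB gamma curve first):

```python
import numpy as np

# Linear sRGB -> XYZ matrix (D65 white point)
M = np.array([[0.4124, 0.3576, 0.1805],
              [0.2126, 0.7152, 0.0722],
              [0.0193, 0.1192, 0.9505]])

white = M @ np.array([1.0, 1.0, 1.0])   # linear-RGB white
print(white)    # approximately the D65 white point; Y (luminance) = 1.0
```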

Exercise 1-4: To apply the skimage.exposure.equalize_adapthist function to a color image the following steps are used:

  1. The skimage.color.rgb2xyz function is used to convert the RGB image to XYZ format.
  2. The color channels of the transformed image are displayed.
  3. The equalized XYZ image is converted to RGB and the color channels and densities are displayed.

    Execute the code and examine the results.

In [35]:
retina_xyz = rgb2xyz(retina_image)
plot_3_color_channels(retina_xyz)

for i in range(3):
    retina_xyz[:,:,i] = exposure.equalize_adapthist(retina_xyz[:,:,i])

retina_rgb = xyz2rgb(retina_xyz)
plot_3_color_channels(retina_rgb)
plot_image_distributions(retina_rgb, xlim=(0.0,1.0))  

Compare the equalized RGB images and pixel value densities to the images and densities of the original image and answer the following questions:

  1. Did the histogram equalization achieve the goal of improving the contrast of the image both in the color channels and for the 3-channel color image, and why?
  2. Given the use of the locally adapted histogram equalization algorithm, does the distribution of the pixel values in the 3 channels of the equalized image make sense, and why?
  3. What is the evidence of saturation of the red color channel after equalization?
  4. Does the change in color of the 3-channel color image make sense given the histogram equalization, and why?
    End of exercise.

Answers:

  1. Yes, the histogram equalization achieves the goal of improving the contrast of the image both in the color channels and for the 3-channel color image as there are improvements in contrast individual color channels as well as in the 3-channel color, better clarity of details, and overall image appearance
  2. Yes, the distribution of pixel values in the 3 channels of the equalized image after using locally adapted histogram equalization algorithm makes sense because it reflects the local contrast enhancement and intensity adjustments performed to address spatial variations and enhance details in different regions of the image. The algorithm preserves color balance while effectively improving contrast, resulting in better looking and more informative images.
  3. Evidence of saturation in the red color channel after equalization can be observed as histogram peaks at the extreme right end of the intensity scale. Moreover, the red histogram is truncated at the maximum intensity value, suggesting that pixel values have reached saturation.
  4. The change in color of the 3-channel color image after histogram equalization makes sense as it leads to improvements in contrast, brightness, and color balance for the red, blue and green channel.

Rank equalization¶

Contrast improvement is such an important data preparation step for computer vision that many algorithms have been proposed. One approach is to use rank statistics over a small local region of the image. The skimage.filters.rank.equalize function implements just such an algorithm. Execute the code in the cell below to see the effect this algorithm has on the gray-scale retina image.

In [36]:
retina_rank_equalized = equalize(np.multiply(retina_gray_scale, 255).astype(np.uint8), footprint=disk(9))
plot_grayscale(retina_rank_equalized)

This locally equalized image shows considerably more detail than the global histogram equalization or adaptive histogram equalization methods. But, is this what we really want? In some cases yes. If fine details like texture are important to the computer vision solution, this equalization would be preferred. However, too much detail might lead to unnecessary complexity if the goal was to identify major structural elements of the image. In summary, the correct preprocessing for an image depends on the other algorithms one intends to apply.

Other Contrast Adjustments¶

Besides histogram equalization, numerous mathematical transformations for improving contrast have been developed. These methods seek to improve contrast by a nonlinear transformation of the pixel values. We will examine just a few of the many possibilities:

  • Gamma adjustment is a power law transformation that shifts the histogram of the pixel values. For input pixel values $x_i$, and power, $\gamma$, the output pixel values are computed as $x'_i = gain * x_i^{\gamma}$, where gain is an optional scale adjustment. If $\gamma < 1$ the histogram shifts to the right and for $\gamma > 1$ the histogram shifts to the left.
  • Logarithmic adjustment computes a logarithmic compression of the pixel values, $x_i$, $x'_i = gain * log(x_i + 1)$, where gain is an optional scale adjustment.
  • Sigmoidal adjustment is a nonlinear transformation of the pixel values, $x_i$, with a cutoff value and an optional gain adjustment, $x'_i = \frac{1}{1 + exp(gain * (cutoff - x_i))}$.
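The three formulas above can be transcribed directly; a sketch on a few sample values in $[0, 1]$ (the gain and cutoff choices here are arbitrary, and skimage's implementations differ in details such as the log base):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 5)                 # sample pixel values

gamma_adj = x ** 0.5                          # gamma < 1 brightens mid-tones
log_adj = np.log(x + 1.0)                     # logarithmic compression, gain = 1
sigmoid_adj = 1.0 / (1.0 + np.exp(10.0 * (0.5 - x)))   # cutoff = 0.5, gain = 10

print(gamma_adj[1])     # 0.25 ** 0.5 = 0.5
print(sigmoid_adj[2])   # 0.5 exactly at the cutoff
```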

Exercise 1-5: To get a feel for the gamma adjustment method you will now do the following:

  1. Iterate over gamma values of $[0.5, 2.0]$.
  2. Apply the gamma adjustment skimage.exposure.adjust_gamma function to the gray scale retina image.
  3. Display the adjusted image and the pixel value density. Make sure you include a printed indication of gamma for each case.
  4. Execute your code.
In [37]:
gamma_values = [0.5, 2.0]

for gamma in gamma_values:
    adjusted_image = exposure.adjust_gamma(retina_gray_scale, gamma)
    print(f"Gamma value: {gamma}")
    plt.figure(figsize=(10, 5))

    # Display adjusted image
    plt.subplot(1, 2, 1)
    plt.imshow(adjusted_image, cmap='gray')
    plt.title(f"Adjusted Image (Gamma = {gamma})")
    plt.axis('off')

    # Display pixel value density
    plt.subplot(1, 2, 2)
    plt.hist(adjusted_image.flatten(), bins=256, range=(0, 1), density=True, color='gray', alpha=0.7)
    plt.title("Pixel Value Density")
    plt.xlabel("Pixel Value")
    plt.ylabel("Density")

    plt.show()
Gamma value: 0.5
Gamma value: 2.0

Examine your results for the values of gamma, comparing them to the original gray-scale image. How does the brightness of the image and the distribution of pixel values change with gamma?
End of exercise.

Answer: We observe that adjusting the gamma value affects the overall brightness and contrast of the image by altering the distribution of pixel values. For a gamma value of 0.5, the adjusted image appears brighter than the original gray-scale image and the histogram shifts to the right, whereas for a gamma value of 2.0 the adjusted image appears darker and the histogram shifts to the left, compared to the original histogram which spans roughly 0 to 0.8.

Exercise 1-6: To get a feel for the sigmoidal adjustment method you will now do the following:

  1. Iterate over cutoff values of $[0.3,0.4,0.5]$.
  2. Apply the sigmoidal adjustment skimage.exposure.adjust_sigmoid function to the gray scale retina image.
  3. Display the adjusted image and the pixel value density. Make sure you include a printed indication of the cutoff for each case.
  4. Execute your code.
In [38]:
cutoff_values = [0.3, 0.4, 0.5]

for cutoff in cutoff_values:
    # Apply sigmoidal adjustment
    adjusted_image = exposure.adjust_sigmoid(retina_gray_scale, cutoff=cutoff)
    print(f"Cutoff value: {cutoff}")
    plt.figure(figsize=(10, 5))

    # Display adjusted image
    plt.subplot(1, 2, 1)
    plt.imshow(adjusted_image, cmap='gray')
    plt.title(f"Adjusted Image (Cutoff = {cutoff})")
    plt.axis('off')

    # Display pixel value density
    plt.subplot(1, 2, 2)
    plt.hist(adjusted_image.flatten(), bins=256, range=(0, 1), density=True, color='gray', alpha=0.7)
    plt.title("Pixel Value Density")
    plt.xlabel("Pixel Value")
    plt.ylabel("Density")

    plt.show()
Cutoff value: 0.3
Cutoff value: 0.4
Cutoff value: 0.5

Examine the images and the pixel value densities for the resulting images and compare these to the original gray-scale image. How does the brightness and densities change with the cutoff value? Pay attention to expansion or compression of the range of pixel values.

Answer: We observe that compared to the original gray-scale image, the lowest cutoff value (0.3) increases contrast and brightness in the darker regions of the image while preserving or slightly enhancing brightness in the lighter regions, leading to an overall increase in perceived brightness. The opposite happens as the cutoff value increases to 0.4 and 0.5. Sigmoidal adjustment enables us to control contrast and brightness more selectively than gamma adjustment: the cutoff parameter sets the midpoint of the sigmoid function, so lower cutoff values brighten more of the image while higher cutoff values brighten less of it.

Binary Images¶

Many computer vision algorithms operate on binary images. Primarily these methods are in the category of morphology, which we will explore later. A binary image has only two values, $\{positive, negative \}$ or $\{ 1, 0 \}$.
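Thresholding a gray-scale array into such a binary image is a one-line comparison in NumPy; a minimal sketch on a hypothetical 2 x 2 array:

```python
import numpy as np

gray = np.array([[0.2, 0.8],
                 [0.5, 0.9]])

# Pixels strictly above the threshold become 1, the rest 0
binary = (gray > 0.5).astype(np.uint8)
print(binary)   # [[0 1]
                #  [0 1]]
```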

Exercise 1-7: You will complete a function named transform2binary() to convert either a 3-channel color image or a gray-scale image to an integer binary image, $\{ 1, 0 \}$, given a threshold value in the range $0 \le threshold \le 1$ as an argument. The function must do the following:

  1. If the image is multi-channel, convert it to gray-scale.
  2. Transform the threshold value to the fraction of the range of the pixel values. Print the transformed threshold value.
  3. Apply the threshold to the gray-scale pixel values and return the binary images.
  4. Execute your function on the locally equalized color retina image, print the dimensions of the binary image, and display the image, using a threshold value of 0.37.
  5. Execute your function on the locally equalized gray scale retina image, print the dimensions of the binary image, and display the image, using a threshold value of 0.37.
In [39]:
def transform2binary(image, threshold):
    # Convert to grayscale if the image is multi-channel
    if len(image.shape) == 3:
        image_gray = rgb2gray(image)
    else:
        image_gray = image

    # Transform the threshold value to the fraction of the range of the pixel values
    threshold_value = np.min(image_gray) + threshold * (np.max(image_gray) - np.min(image_gray))

    print(f"Transformed threshold value: {threshold_value}")

    # Apply thresholding to convert to binary image
    binary_image = np.where(image_gray > threshold_value, 1, 0)

    return binary_image

# Example usage on the locally equalized color retina image
binary_color_image = transform2binary(retina_rank_equalized, threshold=0.37)
print("Dimensions of binary color image:", binary_color_image.shape)
plt.imshow(binary_color_image, cmap='gray')
plt.title("Binary Color Image")
plt.axis('off')
plt.show()

# Example usage on the locally equalized grayscale retina image
binary_gray_image = transform2binary(retina_gray_scale, threshold=0.37)
print("Dimensions of binary grayscale image:", binary_gray_image.shape)
plt.imshow(binary_gray_image, cmap='gray')
plt.title("Binary Grayscale Image")
plt.axis('off')
plt.show()
Transformed threshold value: 93.98
Dimensions of binary color image: (1411, 1411)
Transformed threshold value: 0.3414952
Dimensions of binary grayscale image: (1411, 1411)

Examine the image and answer the following questions.

  1. Does the binary image created from the equalized color image capture key aspects of the retina and why?
  2. Compare the binary images created from the equalized color image and the equalized gray scale image. Is there any difference, and is this the result you would expect?
    End of exercise.

Answers:

  1. Yes, the binary image created from the equalized image captures key aspects of the retina. The equalization enhances local contrast, so the transformed threshold effectively separates the blood vessels and other key features of the retina from the background.
  2. There is a clear difference between the binary images created from the equalized image and the gray-scale image, which is the expected result: the contrast enhancement changes the pixel value distribution, so the same relative threshold selects different sets of pixels.

In the foregoing exercise, the threshold for the decision classifying pixel values as true or false, $\{ 0, 1 \}$ was set manually by trial and error. There are numerous algorithms which have been devised for finding thresholds. In general, these algorithms attempt to find an optimal threshold using various measures. Ideally, these algorithms search for a low frequency point in the pixel value histograms which can be used to divide the values.
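One widely used measure is Otsu's: choose the threshold that maximizes the between-class variance of the two resulting pixel groups. A minimal NumPy sketch of the search (the function name and test data are hypothetical; in practice skimage provides skimage.filters.threshold_otsu):

```python
import numpy as np

def otsu_threshold(img):
    '''Search for the threshold maximizing between-class variance.'''
    hist, _ = np.histogram(img.flatten(), bins=256, range=(0, 256))
    p = hist.astype(np.float64) / hist.sum()
    cum_w = np.cumsum(p)                       # class-0 weight below t
    cum_mu = np.cumsum(p * np.arange(256))     # class-0 mean numerator
    mu_total = cum_mu[-1]
    best_t, best_var = 0, 0.0
    for t in range(1, 256):
        w0, w1 = cum_w[t - 1], 1.0 - cum_w[t - 1]
        if w0 == 0.0 or w1 == 0.0:
            continue
        mu0 = cum_mu[t - 1] / w0
        mu1 = (mu_total - cum_mu[t - 1]) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# A clearly bimodal test image: dark cluster near 50, bright cluster near 200
rng = np.random.default_rng(1)
pixels = np.concatenate([rng.normal(50, 5, 500), rng.normal(200, 5, 500)])
img = pixels.clip(0, 255).astype(np.uint8)
print(otsu_threshold(img))   # falls between the two clusters
```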

Exercise 1-8: We can create a binary image using one of the many established algorithms to compute a threshold, in this case Otsu's threshold algorithm. Use this function to find a threshold, apply the threshold to the equalized gray-scale image to compute a binary image, and plot the result.

In [47]:
from skimage.filters import threshold_otsu

# Select the adaptively equalized gray-scale image from the stacked results
gray_image = retina_gray_scale_equalized[1]

# Compute Otsu's threshold
thresh = threshold_otsu(gray_image)

# Generate binary image based on the computed threshold
binary_image = gray_image > thresh

# Plot the binary image
plt.imshow(binary_image, cmap='gray')
plt.title("Binary Image (Otsu's Threshold)")
plt.axis('off')
plt.show()
In [68]:
print("Shape of retina_gray_scale:", retina_gray_scale_equalized.shape)
Shape of retina_gray_scale: (2, 1411, 1411)

How does this binary image compare to the ones computed with the threshold found by trial-and-error, and why?
End of exercise.

Answer: The equalized results were stacked into a single (2, 1411, 1411) array, so one of the two equalized images must be selected by index before thresholding; passing the whole stack raises a ValueError stating that the input array must have size 3 along channel_axis. With a single image selected, Otsu's algorithm finds a threshold automatically, and the resulting binary image is comparable to the one found by trial and error without any manual tuning.

Inversion of Images¶

For some machine vision algorithms it is easier or more effective to work with the inverse, or negative, of the image. The concept is simple. Pixel values are generally restricted to a range like $\{ 0, 255 \}$ for an unsigned integer representation or $\{ 0.0, 1.0 \}$ for a floating point image. Given an intensity $P_{i,j}$ of the $ij$th pixel, the inverted intensity, $I_{i,j}$, is then:

$$I_{i,j} = max\big[ P \big] - P_{i,j}$$

Where, $max\big[ P \big]$ is the largest value the representation of the image allows, typically 255 or 1.0.
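For a uint8 image this formula is a single vectorized subtraction; a quick sketch on a few sample values:

```python
import numpy as np

pixels = np.array([0, 100, 255], dtype=np.uint8)
inverted = 255 - pixels     # max[P] - P for the uint8 representation
print(inverted)             # [255 155   0]
```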

Exercise 1-9: You will now write a function named invert_image that will perform image inversion on both 3-channel and gray-scale images. Make sure you find the correct maximum value for the data type of the image: 255 for uint8 or 1.0 for float.

Now, apply your function to the original color retina image and display the image along with the density plot.

In [53]:
def invert_image(image):
    # Check the data type of the image
    if image.dtype == np.uint8:
        max_val = 255
    elif np.issubdtype(image.dtype, np.floating):  # np.float was removed in NumPy 1.24
        max_val = 1.0
    else:
        raise ValueError("Unsupported data type.")

    # Invert the image
    inverted_image = max_val - image

    return inverted_image

# Example usage on the original color retina image
inverted_retina_image = invert_image(retina_image)

# Plot histograms for each color channel of the inverted image
plt.figure(figsize=(12, 6))

# Compute histograms for each color channel
for i in range(3):  # Iterate over each color channel (Red, Green, Blue)
    hist, bins = np.histogram(inverted_retina_image[:,:,i].flatten(), bins=256, range=(0, 255))
    plt.subplot(2, 3, i+1)
    plt.plot(hist, color=['red', 'green', 'blue'][i])
    plt.title(f'Inverted Image - {["Red", "Green", "Blue"][i]} Channel')

# Display the inverted image
plt.subplot(2, 3, 4)
plt.imshow(inverted_retina_image)
plt.title('Inverted Retina Image')
plt.axis('off')

plt.tight_layout()
plt.show()

Answer the following questions:

  1. Compare the distribution of the pixel values for the three color channels of the inverted image with the distributions for the original image. Do the distributions for the inverted image make sense given the original values and why?
  2. Do you think the color of the 3-channel inverted image is correct and why?

Next, apply your function to the adaptive histogram equalized gray-scale retina image and display the image along with the distribution plot.

In [67]:
# Select the 2-D slice if an extra leading axis was introduced upstream
equalized = retina_gray_scale_equalized
if equalized.ndim == 3:
    equalized = equalized[0]
inverted_retina_grayscale = invert_image(equalized)
print(inverted_retina_grayscale.shape)
plot_grayscale(inverted_retina_grayscale)
plot_gray_scale_distribution(inverted_retina_grayscale)
  3. The difference in pixel value distributions between the inverted gray-scale image and the original adaptive histogram equalized image is subtle. What key difference can you identify?

End of exercise.

Answers:

  1. Comparing the histograms of the original and inverted retina images, the distribution of the red channel lies toward the right while those of the green and blue channels lie toward the left in the original image; in the inverted image each channel's distribution is mirrored to the opposite side. So the distributions for the inverted image make sense given the original values.
  2. Yes, the color of the 3-channel inverted image is correct. Inverting an image reverses the intensity values, so darker areas become lighter and lighter areas become darker, and each color is mapped toward its complement, which is exactly what we expect of a photographic negative.
  3. The equalized gray-scale array initially had shape (2, 1411, 1411) rather than (1411, 1411), which produced a TypeError (Invalid shape for image data); selecting the 2-D slice before inverting resolves this.

Sampling and resizing images¶

For many computer vision processes the dimensions of an image must be transformed. We have already explored removing the multi-channel color dimension from an image to form the gray-scale image. Now, we will investigate transforming the pixel row and column dimensions of an image. There are two options:

  1. Down-sample: A down-sampled image has a reduced number of pixels. If the ratio between the pixel counts of the original and down-sampled images is an integer, sampled pixel values can be used directly; otherwise, interpolation is required for arbitrary ratios. Inevitably, down-sampling reduces the resolution of the image, and fine details are lost.
  2. Up-sample: The number of pixels can be increased by interpolating between the original pixel values; the interpolated values fill in the new pixels. If the pixel count of the up-sampled image is not an integer multiple of the original, most of the values will be interpolated. While up-sampling can increase the number of pixels, this process cannot increase the resolution of an image.
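These two operations can be sketched with skimage.transform.resize on a tiny synthetic image (the 4x4 test image is hypothetical, chosen so the loss of detail is easy to see):

```python
import numpy as np
from skimage.transform import resize

# Hypothetical 4x4 test image with a bright 2x2 block in one corner
img = np.zeros((4, 4))
img[:2, :2] = 1.0

# Down-sample to 2x2: pixel values are interpolated and fine detail is lost
down = resize(img, (2, 2))
print(down.shape)  # (2, 2)

# Up-sample back to 4x4: interpolation fills in the new pixels, but the
# sharp block edge is not recovered -- resolution is not restored
up = resize(down, (4, 4))
print(up.shape)  # (4, 4)
```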

Exercise 1-10: You will now resize the adaptive histogram equalized gray-scale image. Using the skimage.transform.resize function, do the following:

  1. Down-sample the image to dimension $(64,64)$. Print the dimensions and display the resulting image.
  2. Up-sample the down-sampled image to dimension $(1024,1024)$. Print the dimensions and display the resulting image.
In [56]:
# Down-sample the image to dimension (64, 64)
downsampled_image = resize(retina_gray_scale, (64, 64))

# Print the dimensions of the down-sampled image
print("Dimensions of down-sampled image:", downsampled_image.shape)

# Display the down-sampled image
plt.imshow(downsampled_image, cmap='gray')
plt.title('Down-sampled Image (64x64)')
plt.axis('off')
plt.show()
Dimensions of down-sampled image: (64, 64)
In [57]:
# Up-sample the down-sampled image to dimension (1024, 1024)
upsampled_image = resize(downsampled_image, (1024, 1024))

# Print the dimensions of the up-sampled image
print("Dimensions of up-sampled image:", upsampled_image.shape)

# Display the up-sampled image
plt.imshow(upsampled_image, cmap='gray')
plt.title('Up-sampled Image (1024x1024)')
plt.axis('off')
plt.show()
Dimensions of up-sampled image: (1024, 1024)

Notice the changes in resolution of the down-sampled and up-sampled images.

  1. How is the reduction in resolution of the $(64,64)$ image exhibited?
  2. Does up-sampling to $(1024,1024)$ restore the resolution of the image or simply blur the 'pixelation' visible in the $(64,64)$ image?

Answers:

  1. The down-sampled image appears less sharp and detailed compared to the original gray-scale image. Some of the finer details are blurred due to the reduction in pixel density. This leads to loss of information as well.
  2. We observe that upsampling to (1024, 1024) does not restore the original resolution of the image. While it does reduce the visible pixelation present in the down-sampled image, it does not restore the lost information discarded during downsampling.

Sampling and Aliasing in Images¶

As should be clear from the foregoing, the digital images we work with for computer vision are discretely sampled in the 2-dimensional plane. The discrete pixel values $v_{\mathbf{x}}$ are the result of this sampling, which limits the spatial resolution that can be captured in the image. If the samples are spaced too far apart, aliasing will occur. By the Nyquist-Shannon sampling theorem, the sampling frequency must be at least 2 times the frequency of each sinusoidal component of the image. This minimum rate, 2 times the highest frequency component, is known as the Nyquist rate. Sampling below the Nyquist rate leads to aliasing.
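Before turning to images, the Nyquist condition can be illustrated in one dimension. In this sketch the frequencies are hypothetical, chosen so a 9 Hz sinusoid sampled at only 10 Hz (well below its Nyquist rate of 18 Hz) produces the same samples as a 1 Hz sinusoid:

```python
import numpy as np

f_signal = 9.0   # signal frequency in Hz (hypothetical)
f_sample = 10.0  # sampling rate, below the Nyquist rate of 2 * 9 = 18 Hz

t = np.arange(0, 1, 1 / f_sample)           # sample times
samples = np.sin(2 * np.pi * f_signal * t)  # under-sampled 9 Hz sinusoid

# At these sample times, sin(2*pi*9*t) equals -sin(2*pi*1*t): the 9 Hz
# signal is indistinguishable from a (sign-flipped) 1 Hz alias.
alias = np.sin(2 * np.pi * 1.0 * t)
print(np.allclose(samples, -alias))  # True
```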

We can demonstrate the concept of aliasing with a simple example. The example is based on an initial image and three down-sampled versions:

  1. The initial image has diagonal slashes with sinusoidal amplitudes and dimension $(256,256)$.
  2. The image is down-sampled to dimension $(128, 128)$.
  3. The image is down-sampled to dimension $(64, 64)$.
  4. The image is down-sampled to dimension $(32, 32)$.
In [58]:
dim = 256
x = np.arange(0, dim*dim, 1)
sin_image = 1.0 - 0.5 * np.sin(np.divide(x, math.pi)).reshape((dim,dim))

fig, ax = plt.subplots(2,2, figsize=(12, 12))
ax = ax.flatten()
for i,dim in enumerate([256,128,64,32]):
    sampled_image = resize(sin_image, (dim,dim))
    _=ax[i].imshow(sampled_image, cmap=plt.get_cmap('gray'))
    _=ax[i].set_title('Dimension = (' + str(dim) + ',' + str(dim) + ')')

Examine the images above and notice the following:

  1. The $(128, 128)$ down-sampled image retains the characteristics of the initial image. Look carefully, you can see a slight blurring.
  2. The $(64, 64)$ down-sampled image retains the sinusoidal slash structure. Coarse pixelation is now quite evident and the sampling is very close to the Nyquist rate.
  3. The $(32, 32)$ down-sampled image does not resemble the initial image at all, exhibiting significant aliasing. Run your eye side to side and up and down on the image. You may see patterns that are not representative of the original image. Such false patterns are a common artifact arising from aliasing.

How can aliasing be prevented? A filter can remove the high frequency components of the image which would lead to the aliasing. A common approach is to use a Gaussian filter. This filter removes high frequency components and has the effect of blurring the image.
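As a minimal sketch of this idea, assuming a hypothetical stripe image whose single spatial frequency is far above what a $(64, 64)$ grid can represent:

```python
import numpy as np
from skimage.filters import gaussian
from skimage.transform import resize

# Hypothetical high-frequency test image: alternating vertical stripes
img = np.zeros((256, 256))
img[:, ::2] = 1.0

# Naive down-sampling with anti-aliasing disabled aliases the stripes
aliased = resize(img, (64, 64), anti_aliasing=False)

# Pre-filtering with a Gaussian removes the high frequencies first,
# trading some blur for freedom from aliasing
smoothed = gaussian(img, sigma=2.0)
filtered = resize(smoothed, (64, 64), anti_aliasing=False)
print(aliased.shape, filtered.shape)  # (64, 64) (64, 64)
```

With sigma of 2 the period-2 stripes are smoothed almost to their mean, so the down-sampled result is nearly uniform rather than a false stripe pattern.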

Exercise 1-11: You will now investigate how filtering can be applied to prevent aliasing. The skimage.transform.resize function applies a Gaussian filter to prevent aliasing by default. The standard deviation of the Gaussian filter, or filter span, can be set to adjust the bandwidth of the filter.
In this exercise you will resample the adaptive histogram equalized gray-scale retina image using the skimage.transform.resize function with the anti_aliasing=False argument. You will use the skimage.filters.gaussian to limit the bandwidth of the image. Do the following to compare the results of different filter bandwidths:

  1. Compute a scale factor, $sf = \sqrt{\frac{original\ dimension}{reduced\ dimension}}$.
  2. Apply the Gaussian filter with $\sigma = multiplier \times sf$ for multiplier in $[0,1,2,3,4]$ (range(5)) and resize the image to $(64,64)$ pixels.
  3. For each value of sigma, display the image with a title indicating the value of sigma. You may find interpretation easier if you plot the images on a $3 \times 2$ grid.
In [66]:
# Original and desired dimension of the image
original_dimension = max(retina_gray_scale.shape)
reduced_dimension = 64

# Compute the scale factor
scale_factor = (original_dimension / reduced_dimension) ** (1/2)

# Generate a range of multiplier values for sigma
multiplier_values = range(5)
plt.figure(figsize=(10, 10))

for idx, multiplier in enumerate(multiplier_values, 1):
    # Compute sigma value
    sigma = scale_factor * multiplier

    # Apply Gaussian filter (skimage.filters.gaussian, imported above as skfilters)
    filtered_image = skfilters.gaussian(retina_gray_scale, sigma=sigma)

    # Resize the filtered image
    resized_image = resize(filtered_image, (64, 64), anti_aliasing=False)

    # Plot the image
    plt.subplot(3, 2, idx)
    plt.imshow(resized_image, cmap='gray')
    plt.title(f"Sigma = {sigma:.2f}")
    plt.axis('off')

plt.tight_layout()
plt.show()

Answer the following questions.

  1. How does the aliasing change with increasing sigma, decreasing bandwidth? Is this the behavior you expect and why?
  2. How does the blurring of the image change with increasing sigma, decreasing bandwidth? Is this the behavior you expect and why?

End of exercise.

Answers:

  1. We observe from the images above that increasing sigma reduces aliasing in the resized images. As sigma increases, the images become smoother and aliasing becomes less prominent, resulting in slightly better image quality with reduced distortion and pixelation.
  2. As the sigma value increases, the blurring of the image becomes more visible. This is because higher sigma values cause the Gaussian filter to have a wider spatial extent, resulting in more smoothing or blurring of the image details.

Copyright 2021, 2022, 2023, 2024, Stephen F Elston. All rights reserved.¶

In [ ]: